Exploration server for music in Saarland

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Using anchor text for homepage and topic distillation search tasks

Internal identifier: 000206 (Main/Exploration); previous: 000205; next: 000207

Authors: Mingfang Wu [Australia]; David Hawking; Andrew Turpin [Australia]; Falk Scholer [Australia]

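Source: Journal of the American Society for Information Science and Technology, vol. 63, no. 6 (2012-06), pp. 1235-1255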

RBID : ISTEX:363F0E1D5823CEC7329BFE6C55F13091838DD671

English descriptors

Abstract

Past work suggests that anchor text is a good source of evidence that can be used to improve web searching. Two approaches for making use of this evidence include fusing search results from an anchor text representation and the original text representation based on a document's relevance score or rank position, and combining term frequency from both representations during the retrieval process. Although these approaches have each been tested and compared against baselines, different evaluations have used different baselines; no consistent work enables rigorous cross‐comparison between these methods. The purpose of this work is threefold. First, we survey existing fusion methods of using anchor text in search. Second, we compare these methods with common testbeds and web search tasks, with the aim of identifying the most effective fusion method. Third, we try to correlate search performance with the characteristics of a test collection. Our experimental results show that the best performing method in each category can significantly improve search results over a common baseline. However, there is no single technique that consistently outperforms competing approaches across different collections and search tasks.
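
The two result-fusion families named in the abstract (score-based and rank-based) can be illustrated with a minimal sketch. It uses textbook forms of CombMNZ, linear interpolation, and Borda count, which appear among the record's descriptors; the exact formulations, normalizations, and weights used in the article are not given on this page, so the function names, the min-max normalization, and the toy scores below are all assumptions made for illustration.

```python
from collections import defaultdict


def min_max_normalize(scores):
    """Linearly rescale a {doc_id: score} mapping into [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every document as equally strong.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def comb_mnz(runs):
    """Score-based fusion: summed normalized score times the number of runs returning the document."""
    total = defaultdict(float)
    hits = defaultdict(int)
    for run in runs:
        for doc, score in min_max_normalize(run).items():
            total[doc] += score
            hits[doc] += 1
    return {doc: total[doc] * hits[doc] for doc in total}


def interpolate(content, anchor, weight=0.5):
    """Score-based fusion: weighted sum of normalized content and anchor scores."""
    c, a = min_max_normalize(content), min_max_normalize(anchor)
    return {
        doc: weight * a.get(doc, 0.0) + (1.0 - weight) * c.get(doc, 0.0)
        for doc in set(c) | set(a)
    }


def borda(rankings):
    """Rank-based fusion: each list awards (pool size - position) points; absent documents get 0."""
    pool = {doc for ranking in rankings for doc in ranking}
    points = defaultdict(float)
    for ranking in rankings:
        for position, doc in enumerate(ranking):
            points[doc] += len(pool) - position
    return dict(points)


# Hypothetical retrieval scores from the two document representations.
content_run = {"d1": 12.0, "d2": 9.0, "d3": 6.0}   # original text
anchor_run = {"d2": 4.0, "d4": 2.0, "d1": 1.0}     # anchor text

fused = comb_mnz([content_run, anchor_run])
fused_order = sorted(fused, key=fused.get, reverse=True)
mixed = interpolate(content_run, anchor_run, weight=0.5)
votes = borda([["d1", "d2", "d3"], ["d2", "d4", "d1"]])
```

In this toy data, document d2 is ranked first by both CombMNZ and Borda because it scores well in both representations. That is the intuition behind fusion, not a result from the article.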

Url:
DOI: 10.1002/asi.22639


Affiliations:



The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Using anchor text for homepage and topic distillation search tasks</title>
<author>
<name sortKey="Wu, Mingfang" sort="Wu, Mingfang" uniqKey="Wu M" first="Mingfang" last="Wu">Mingfang Wu</name>
</author>
<author>
<name sortKey="Hawking, David" sort="Hawking, David" uniqKey="Hawking D" first="David" last="Hawking">David Hawking</name>
</author>
<author>
<name sortKey="Turpin, Andrew" sort="Turpin, Andrew" uniqKey="Turpin A" first="Andrew" last="Turpin">Andrew Turpin</name>
</author>
<author>
<name sortKey="Scholer, Falk" sort="Scholer, Falk" uniqKey="Scholer F" first="Falk" last="Scholer">Falk Scholer</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:363F0E1D5823CEC7329BFE6C55F13091838DD671</idno>
<date when="2012" year="2012">2012</date>
<idno type="doi">10.1002/asi.22639</idno>
<idno type="url">https://api.istex.fr/document/363F0E1D5823CEC7329BFE6C55F13091838DD671/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000560</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Corpus" wicri:corpus="ISTEX">000560</idno>
<idno type="wicri:Area/Istex/Curation">000529</idno>
<idno type="wicri:Area/Istex/Checkpoint">000076</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Checkpoint">000076</idno>
<idno type="wicri:doubleKey">1532-2882:2012:Wu M:using:anchor:text</idno>
<idno type="wicri:Area/Main/Merge">000206</idno>
<idno type="wicri:Area/Main/Curation">000206</idno>
<idno type="wicri:Area/Main/Exploration">000206</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">Using anchor text for homepage and topic distillation search tasks</title>
<author>
<name sortKey="Wu, Mingfang" sort="Wu, Mingfang" uniqKey="Wu M" first="Mingfang" last="Wu">Mingfang Wu</name>
<affiliation wicri:level="3">
<country xml:lang="fr">Australie</country>
<wicri:regionArea>School of Computer Science and Information Technology, RMIT University, Melbourne</wicri:regionArea>
<placeName>
<settlement type="city">Melbourne</settlement>
<region type="état">Victoria (État)</region>
</placeName>
</affiliation>
<affiliation></affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Australie</country>
</affiliation>
</author>
<author>
<name sortKey="Hawking, David" sort="Hawking, David" uniqKey="Hawking D" first="David" last="Hawking">David Hawking</name>
<affiliation></affiliation>
<affiliation>
<wicri:noCountry code="no comma">E-mail: david.hawking@funnelback.com</wicri:noCountry>
</affiliation>
</author>
<author>
<name sortKey="Turpin, Andrew" sort="Turpin, Andrew" uniqKey="Turpin A" first="Andrew" last="Turpin">Andrew Turpin</name>
<affiliation wicri:level="4">
<country xml:lang="fr">Australie</country>
<wicri:regionArea>Department of Computer Science and Software Engineering, University of Melbourne</wicri:regionArea>
<placeName>
<settlement type="city">Melbourne</settlement>
<region type="état">Victoria (État)</region>
</placeName>
<orgName type="university">Université de Melbourne</orgName>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Australie</country>
</affiliation>
</author>
<author>
<name sortKey="Scholer, Falk" sort="Scholer, Falk" uniqKey="Scholer F" first="Falk" last="Scholer">Falk Scholer</name>
<affiliation wicri:level="3">
<country xml:lang="fr">Australie</country>
<wicri:regionArea>School of Computer Science and Information Technology, RMIT University, Melbourne</wicri:regionArea>
<placeName>
<settlement type="city">Melbourne</settlement>
<region type="état">Victoria (État)</region>
</placeName>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Australie</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="j">Journal of the American Society for Information Science and Technology</title>
<title level="j" type="abbrev">J Am Soc Inf Sci Tec</title>
<idno type="ISSN">1532-2882</idno>
<idno type="eISSN">1532-2890</idno>
<imprint>
<publisher>Blackwell Publishing Ltd</publisher>
<date type="published" when="2012-06">2012-06</date>
<biblScope unit="volume">63</biblScope>
<biblScope unit="issue">6</biblScope>
<biblScope unit="page" from="1235">1235</biblScope>
<biblScope unit="page" to="1255">1255</biblScope>
</imprint>
<idno type="ISSN">1532-2882</idno>
</series>
<idno type="istex">363F0E1D5823CEC7329BFE6C55F13091838DD671</idno>
<idno type="DOI">10.1002/asi.22639</idno>
<idno type="ArticleID">ASI22639</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">1532-2882</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="Teeft" xml:lang="en">
<term>Aggregated anchor text</term>
<term>American society</term>
<term>Anchor</term>
<term>Anchor text</term>
<term>Anchor text collection</term>
<term>Anchor text entries</term>
<term>Anchor text representation</term>
<term>Average precision</term>
<term>Baseline</term>
<term>Belkin</term>
<term>Best fusion methods</term>
<term>Best performance</term>
<term>Borda</term>
<term>Callan</term>
<term>Candidate lists</term>
<term>Cerc</term>
<term>Cerc collection</term>
<term>Clueweb collection</term>
<term>Combmax</term>
<term>Combmnz</term>
<term>Computer science</term>
<term>Craswell</term>
<term>Croft</term>
<term>Different document representations</term>
<term>Different query representations</term>
<term>Distillation</term>
<term>Document</term>
<term>Document collections</term>
<term>Document frequency</term>
<term>Document relevance score</term>
<term>Document representation</term>
<term>Document representations</term>
<term>Document retrieval</term>
<term>Field</term>
<term>Elnorm</term>
<term>Entry page</term>
<term>Experimental results</term>
<term>Fusion</term>
<term>Fusion method</term>
<term>Fusion methods</term>
<term>Gaithersburg</term>
<term>Homepage</term>
<term>Hyperlink</term>
<term>Incoming links</term>
<term>Indegree</term>
<term>Individual relevance scores</term>
<term>Information need</term>
<term>Information objects</term>
<term>Information processing</term>
<term>Information retrieval</term>
<term>Information science</term>
<term>Information systems</term>
<term>Ingwersen</term>
<term>International conference</term>
<term>International world</term>
<term>Interpolation</term>
<term>Irrelevant documents</term>
<term>Jaccard</term>
<term>Jaccard similarity</term>
<term>Knowledge management</term>
<term>Language model</term>
<term>Language models</term>
<term>Larsen</term>
<term>Linear combination</term>
<term>Linear combination method</term>
<term>Linear interpolation</term>
<term>Linear interpolation method</term>
<term>Linear normalization</term>
<term>Lnorm</term>
<term>Median</term>
<term>Method normalization</term>
<term>Mnorm</term>
<term>Multiple query representations</term>
<term>National institute</term>
<term>Finding</term>
<term>Finding task</term>
<term>Finding tasks</term>
<term>Nnorm</term>
<term>Nnorm lnorm elnorm mnorm</term>
<term>Normalization</term>
<term>Normalization method</term>
<term>Ogilvie</term>
<term>Ogilvie callan</term>
<term>Okapi</term>
<term>Original text</term>
<term>Outgoing links</term>
<term>Performance difference</term>
<term>Query</term>
<term>Query representation</term>
<term>Query term</term>
<term>Rank position</term>
<term>Reciprocal rank</term>
<term>Relevance</term>
<term>Relevance assessments</term>
<term>Relevance score</term>
<term>Relevance scores</term>
<term>Relevant answer</term>
<term>Relevant document</term>
<term>Relevant documents</term>
<term>Relevant pages</term>
<term>Representation</term>
<term>Retrieval</term>
<term>Retrieval system</term>
<term>Robertson</term>
<term>Search engine</term>
<term>Search performance</term>
<term>Search quality</term>
<term>Search result</term>
<term>Search result fusion</term>
<term>Search results</term>
<term>Search results list</term>
<term>Search task</term>
<term>Search tasks</term>
<term>Sigir</term>
<term>Sigir conference</term>
<term>Similarity score</term>
<term>Similarity scores</term>
<term>Single retrieval system</term>
<term>Success rate</term>
<term>Target document</term>
<term>Target page</term>
<term>Term frequency</term>
<term>Term frequency combination method</term>
<term>Test collections</term>
<term>Testbed</term>
<term>Testbeds</term>
<term>Text retrieval conference</term>
<term>Topic</term>
<term>Topic distillation</term>
<term>Topic distillation task</term>
<term>Trec</term>
</keywords>
</textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract">Past work suggests that anchor text is a good source of evidence that can be used to improve web searching. Two approaches for making use of this evidence include fusing search results from an anchor text representation and the original text representation based on a document's relevance score or rank position, and combining term frequency from both representations during the retrieval process. Although these approaches have each been tested and compared against baselines, different evaluations have used different baselines; no consistent work enables rigorous cross‐comparison between these methods. The purpose of this work is threefold. First, we survey existing fusion methods of using anchor text in search. Second, we compare these methods with common testbeds and web search tasks, with the aim of identifying the most effective fusion method. Third, we try to correlate search performance with the characteristics of a test collection. Our experimental results show that the best performing method in each category can significantly improve search results over a common baseline. However, there is no single technique that consistently outperforms competing approaches across different collections and search tasks.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>Australie</li>
</country>
<region>
<li>Victoria (État)</li>
</region>
<settlement>
<li>Melbourne</li>
</settlement>
<orgName>
<li>Université de Melbourne</li>
</orgName>
</list>
<tree>
<noCountry>
<name sortKey="Hawking, David" sort="Hawking, David" uniqKey="Hawking D" first="David" last="Hawking">David Hawking</name>
</noCountry>
<country name="Australie">
<region name="Victoria (État)">
<name sortKey="Wu, Mingfang" sort="Wu, Mingfang" uniqKey="Wu M" first="Mingfang" last="Wu">Mingfang Wu</name>
</region>
<name sortKey="Scholer, Falk" sort="Scholer, Falk" uniqKey="Scholer F" first="Falk" last="Scholer">Falk Scholer</name>
<name sortKey="Turpin, Andrew" sort="Turpin, Andrew" uniqKey="Turpin A" first="Andrew" last="Turpin">Andrew Turpin</name>
<name sortKey="Wu, Mingfang" sort="Wu, Mingfang" uniqKey="Wu M" first="Mingfang" last="Wu">Mingfang Wu</name>
</country>
</tree>
</affiliations>
</record>
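
As a minimal sketch of reading such a record programmatically, the following uses Python's standard `xml.etree.ElementTree` on a trimmed excerpt of the record above. Two points are assumptions: the `xmlns:wicri` URI is a placeholder added here because the record as shown does not declare the `wicri:` prefix (a strict XML parser rejects an unbound prefix), and the helper name `summarize` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed excerpt of the record. The xmlns:wicri declaration is a placeholder
# (assumption): the page shows the wicri: prefix without declaring it.
RECORD = """<record xmlns:wicri="urn:example:wicri">
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Using anchor text for homepage and topic distillation search tasks</title>
<author>
<name sortKey="Wu, Mingfang" first="Mingfang" last="Wu">Mingfang Wu</name>
</author>
<author>
<name sortKey="Hawking, David" first="David" last="Hawking">David Hawking</name>
</author>
</titleStmt>
<publicationStmt>
<date when="2012" year="2012">2012</date>
<idno type="doi">10.1002/asi.22639</idno>
</publicationStmt>
</fileDesc>
</teiHeader>
</TEI>
</record>"""


def summarize(record_xml):
    """Return title, author names, year, and DOI from a record string."""
    root = ET.fromstring(record_xml)
    title_stmt = root.find("./TEI/teiHeader/fileDesc/titleStmt")
    pub_stmt = root.find("./TEI/teiHeader/fileDesc/publicationStmt")
    return {
        "title": title_stmt.findtext("title"),
        "authors": [name.text for name in title_stmt.findall("author/name")],
        "year": pub_stmt.find("date").get("when"),
        # ElementTree's limited XPath supports attribute predicates like this one.
        "doi": pub_stmt.findtext("idno[@type='doi']"),
    }


info = summarize(RECORD)
```

The elements themselves carry no namespace, so plain tag names work in the paths. Only the `wicri:`-prefixed attributes and elements depend on the namespace declaration.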

To work with this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Sarre/explor/MusicSarreV3/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000206 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000206 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Sarre
   |area=    MusicSarreV3
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:363F0E1D5823CEC7329BFE6C55F13091838DD671
   |texte=   Using anchor text for homepage and topic distillation search tasks
}}
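
When many such links are needed, the template text can be generated rather than typed by hand. The sketch below only reproduces the layout shown above; the function name `explor_link` and its argument names are hypothetical, and nothing here is part of Dilib or the Wicri tooling.

```python
def explor_link(wiki, area, flux, etape, type_, cle, texte):
    """Build the wiki template text, aligning values as in the example above."""
    fields = [
        ("wiki", wiki),
        ("area", area),
        ("flux", flux),
        ("étape", etape),
        ("type", type_),
        ("clé", cle),
        ("texte", texte),
    ]
    lines = ["{{Explor lien"]
    for name, value in fields:
        # Pad "   |name=" to 13 columns so that all values line up.
        lines.append(f"   |{name}=".ljust(13) + value)
    lines.append("}}")
    return "\n".join(lines)


link = explor_link(
    wiki="Wicri/Sarre",
    area="MusicSarreV3",
    flux="Main",
    etape="Exploration",
    type_="RBID",
    cle="ISTEX:363F0E1D5823CEC7329BFE6C55F13091838DD671",
    texte="Using anchor text for homepage and topic distillation search tasks",
)
```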

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Sun Jul 15 18:16:09 2018. Site generation: Tue Mar 5 19:21:25 2024